Optimising the Volgenant-Jonker algorithm for approximating graph edit distance

Authors

  • William Jones
  • Aziem Chawdhary
  • Andy King
Abstract

Although it is agreed that the Volgenant-Jonker (VJ) algorithm provides a fast way to approximate graph edit distance (GED), until now nobody has reported how the VJ algorithm can be tuned for this task. To this end, we revisit VJ and propose a series of refinements that improve both the speed and memory footprint without sacrificing accuracy in the GED approximation. We quantify the effectiveness of these optimisations by measuring distortion between control-flow graphs: a problem that arises in malware matching. We also document an unexpected behavioural property of VJ in which the time required to find shortest paths to unassigned vertices decreases as graph size increases, and explain how this phenomenon relates to the birthday paradox. © 2016 Elsevier Ltd. All rights reserved.
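
For readers unfamiliar with the setting, the following is a minimal sketch of how GED is commonly approximated by a linear assignment problem, which is the kind of problem VJ solves. It is not the authors' implementation or any of their refinements. The cost model (degree mismatch for substitution, a flat `indel` penalty plus degree for insertion and deletion) and all function names are assumptions chosen for illustration, and a brute-force search over permutations stands in for the VJ solver, so it is only usable on very small graphs.

```python
from itertools import permutations

INF = float("inf")


def build_cost_matrix(degs1, degs2, indel=1.0):
    """Build the (n+m) x (n+m) assignment cost matrix for two graphs.

    degs1 / degs2 are lists of node degrees (an assumed, simplified
    stand-in for richer node and edge attributes).
    """
    n, m = len(degs1), len(degs2)
    size = n + m
    cost = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i < n and j < m:
                # Substitute node i of graph 1 with node j of graph 2.
                cost[i][j] = abs(degs1[i] - degs2[j])
            elif i < n:
                # Delete node i: only its own "deletion" column is allowed.
                cost[i][j] = indel + degs1[i] if j - m == i else INF
            elif j < m:
                # Insert node j: only its own "insertion" row is allowed.
                cost[i][j] = indel + degs2[j] if i - n == j else INF
            # Otherwise: dummy-to-dummy assignment, cost stays 0.
    return cost


def solve_assignment_bruteforce(cost):
    """Return the minimum total cost over all row-to-column assignments.

    Exhaustive search, exponential in the matrix size; a placeholder for
    a proper linear assignment solver such as VJ.
    """
    size = len(cost)
    best = INF
    for perm in permutations(range(size)):
        total = sum(cost[i][perm[i]] for i in range(size))
        if total < best:
            best = total
    return best


def approx_ged(degs1, degs2, indel=1.0):
    """Approximate GED as the optimal cost of the assignment problem."""
    return solve_assignment_bruteforce(build_cost_matrix(degs1, degs2, indel))
```

For example, `approx_ged([1, 2, 1], [2, 2, 2])` compares a three-node path with a triangle using only node degrees. Replacing the brute-force step with a polynomial-time assignment solver is what makes this style of approximation practical on larger graphs.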

Similar resources

Revisiting Volgenant-Jonker for Approximating Graph Edit Distance

Although it is agreed that the Volgenant-Jonker (VJ) algorithm provides a fast way to approximate graph edit distance (GED), until now nobody has reported how the VJ algorithm can be tuned for this task. To this end, we revisit VJ and propose a series of refinements that improve both the speed and memory footprint without sacrificing accuracy in the GED approximation. We quantify the effectiven...

Optimising Tree Edit Distance with Subtrees for Textual Entailment

This paper introduces a method for improving tree edit distance (TED) for textual entailment. We explore two ways of improving TED: we extend the standard TED to use edit operations that apply to subtrees as well as to single nodes; and we use the ‘artificial bee colony’ algorithm (ABC) to estimate the cost of edit operations for single nodes and subtrees and to determine thresholds. The prelim...

Comparing Stars: On Approximating Graph Edit Distance

Graph data have become ubiquitous and manipulating them based on similarity is essential for many applications. Graph edit distance is one of the most widely accepted measures to determine similarities between graphs and has extensive applications in the fields of pattern recognition, computer vision etc. Unfortunately, the problem of graph edit distance computation is NP-Hard in general. Accor...

Approximating Graph Edit Distance Using GNCCP

The graph edit distance (GED) is a flexible and widely used dissimilarity measure between graphs. Computing the GED between two graphs can be performed by solving a quadratic assignment problem (QAP). However, the problem is NP complete hence forbidding the computation of the optimal GED on large graphs. To tackle this drawback, recent heuristics are based on a linear approximation of the initi...

Approximating optimal solution structure with edit distance and its applications

An alternative notion of approximation arising in cognitive psychology, bioinformatics and linguistics is that of computing a solution which is structurally close to an optimal one. That is, an approximate solution is considered good if its distance from an optimal solution is small, for a distance measure such as Hamming distance or edit distance. There has been a modicum of work on approximat...

Journal:
  • Pattern Recognition Letters

Volume 87

Publication date: 2017